
Conversation

cynthiajoan
Collaborator

Description

Replace this paragraph with a description of what this PR is doing. If you're modifying existing behavior, describe the existing behavior, how this PR is changing it, and what motivated the change.

Related Issues

Replace this paragraph with a list of issues related to this PR from the issue database. Indicate which of these issues are resolved or fixed by this PR. Note that you'll have to prefix the issue numbers with flutter/flutter#.

Checklist

Before you create this PR, confirm that it meets all the requirements listed below by checking the relevant checkboxes ([x]).
This will ensure a smooth and quick review process. Updating pubspec.yaml and the changelogs is not required.

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • My PR includes unit or integration tests for all changed/updated/fixed behaviors (See Contributor Guide).
  • All existing and new tests are passing.
  • I updated/added relevant documentation (doc comments with ///).
  • The analyzer (melos run analyze) does not report any problems on my PR.
  • I read and followed the Flutter Style Guide.
  • I signed the CLA.
  • I am willing to follow up on review comments in a timely manner.

Breaking Change

Does your PR require plugin users to manually update their apps to accommodate your change?

  • Yes, this is a breaking change.
  • No, this is not a breaking change.

@cynthiajoan
Collaborator Author

/gemini summarize


This pull request introduces bidirectional transcription capabilities for Firebase AI. Key changes include:

  • New Configuration: Adds AudioTranscriptionConfig and integrates inputAudioTranscription and outputAudioTranscription into LiveGenerationConfig to enable transcription for both input and output audio streams.
  • Transcription Data Model: Introduces a Transcription class to represent transcription text and its completion status.
  • Live Server Content: Extends LiveServerContent to include inputTranscription and outputTranscription fields, allowing the live server to send transcription updates.
  • Example App Updates: The example application (bidi_page.dart) has been updated to display these new transcription messages in real-time, including logic to append new transcription segments to existing messages. The text field in MessageData was made mutable to facilitate these updates.

This feature enhances the live generation experience by providing real-time text representations of both user input and model output audio.
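
Based on the class and field names described above, enabling transcription on a live session might look like the following sketch (the constructor signatures are assumed from this summary, not verified against the published API):

    // Hypothetical usage sketch: AudioTranscriptionConfig and the two
    // LiveGenerationConfig fields are the names introduced in this PR.
    final liveConfig = LiveGenerationConfig(
      inputAudioTranscription: AudioTranscriptionConfig(),
      outputAudioTranscription: AudioTranscriptionConfig(),
    );

With both fields set, the live server should include inputTranscription and outputTranscription updates in its LiveServerContent messages.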

@cynthiajoan cynthiajoan changed the title feat(firebaseai): bidi transcript feat(firebaseai): add bidi transcript Oct 16, 2025
@cynthiajoan
Collaborator Author

/gemini review

@cynthiajoan cynthiajoan marked this pull request as ready for review October 16, 2025 22:13

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for bidirectional transcription in Firebase AI. The changes are well-structured, introducing new configurations and handling transcription messages. I've identified a few areas for improvement, including an unused variable, opportunities to reduce code duplication, and a suggestion to maintain immutability in a data class. My detailed comments and code suggestions aim to enhance code quality and maintainability.

Comment on lines +362 to +397
if (message.inputTranscription?.text != null) {
  final transcription = message.inputTranscription!;
  if (_inputTranscriptionMessageIndex != null) {
    // TODO(cynthia): find a better way to update the message
    _messages[_inputTranscriptionMessageIndex!].text =
        '${_messages[_inputTranscriptionMessageIndex!].text}${transcription.text!}';
  } else {
    _messages.add(MessageData(
        text: 'Input transcription: ${transcription.text!}',
        fromUser: true));
    _inputTranscriptionMessageIndex = _messages.length - 1;
  }
  if (transcription.finished ?? false) {
    _inputTranscriptionMessageIndex = null;
  }
  setState(_scrollDown);
}
if (message.outputTranscription?.text != null) {
  final transcription = message.outputTranscription!;
  if (_outputTranscriptionMessageIndex != null) {
    _messages[_outputTranscriptionMessageIndex!].text =
        '${_messages[_outputTranscriptionMessageIndex!].text}${transcription.text!}';
  } else {
    _messages.add(
      MessageData(
        text: 'Output transcription: ${transcription.text!}',
        fromUser: false,
      ),
    );
    _outputTranscriptionMessageIndex = _messages.length - 1;
  }
  if (transcription.finished ?? false) {
    _outputTranscriptionMessageIndex = null;
  }
  setState(_scrollDown);
}


medium

The logic for handling inputTranscription and outputTranscription is nearly identical. This duplication can be reduced by extracting the common logic into a private helper method. This would make the _handleLiveServerMessage method shorter, cleaner, and easier to maintain.
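
One possible shape for such a helper, sketched from the duplicated logic above (the name _appendTranscription and its parameters are illustrative suggestions, not code from this PR):

    // Appends a transcription segment to the message at messageIndex, or
    // starts a new message; returns the index to keep tracking, or null
    // once the transcription is finished.
    int? _appendTranscription(
      Transcription transcription,
      int? messageIndex, {
      required String label,
      required bool fromUser,
    }) {
      if (messageIndex != null) {
        _messages[messageIndex].text =
            '${_messages[messageIndex].text}${transcription.text!}';
      } else {
        _messages.add(MessageData(
            text: '$label: ${transcription.text!}', fromUser: fromUser));
        messageIndex = _messages.length - 1;
      }
      setState(_scrollDown);
      return (transcription.finished ?? false) ? null : messageIndex;
    }

Each call site would then reduce to a single assignment, e.g. _inputTranscriptionMessageIndex = _appendTranscription(transcription, _inputTranscriptionMessageIndex, label: 'Input transcription', fromUser: true);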

  });
  final Uint8List? imageBytes;
- final String? text;
+ String? text;


medium

Changing text to be a mutable field makes the MessageData class mutable. In Flutter, it's a best practice to use immutable data classes to prevent unexpected side effects and make state management more predictable.

I recommend reverting this to a final field and adding a copyWith method to MessageData. This allows you to create a new instance with updated data, which aligns better with declarative UI principles.

Here's an example of how you could implement copyWith:

class MessageData {
  // ... existing final properties

  MessageData copyWith({
    String? text,
    // ... other properties
  }) {
    return MessageData(
      text: text ?? this.text,
      // ... other properties
    );
  }
}
Suggested change
- String? text;
+ final String? text;

Comment on lines +394 to +418
Transcription? inputTranscription;
if (serverContentJson.containsKey('inputTranscription')) {
  final transcriptionJson =
      serverContentJson['inputTranscription'] as Map<String, dynamic>;
  inputTranscription = Transcription(
    text: transcriptionJson['text'] as String?,
    finished: transcriptionJson['finished'] as bool?,
  );
}
Transcription? outputTranscription;
if (serverContentJson.containsKey('outputTranscription')) {
  final transcriptionJson =
      serverContentJson['outputTranscription'] as Map<String, dynamic>;
  outputTranscription = Transcription(
    text: transcriptionJson['text'] as String?,
    finished: transcriptionJson['finished'] as bool?,
  );
}
return LiveServerContent(
  modelTurn: modelTurn,
  turnComplete: turnComplete,
  interrupted: interrupted,
  inputTranscription: inputTranscription,
  outputTranscription: outputTranscription,
);


low

The logic for parsing inputTranscription and outputTranscription is duplicated. To improve code quality and reduce redundancy, you can extract this logic into a local helper function within _parseServerMessage.

Suggested change
- Transcription? inputTranscription;
- if (serverContentJson.containsKey('inputTranscription')) {
-   final transcriptionJson =
-       serverContentJson['inputTranscription'] as Map<String, dynamic>;
-   inputTranscription = Transcription(
-     text: transcriptionJson['text'] as String?,
-     finished: transcriptionJson['finished'] as bool?,
-   );
- }
- Transcription? outputTranscription;
- if (serverContentJson.containsKey('outputTranscription')) {
-   final transcriptionJson =
-       serverContentJson['outputTranscription'] as Map<String, dynamic>;
-   outputTranscription = Transcription(
-     text: transcriptionJson['text'] as String?,
-     finished: transcriptionJson['finished'] as bool?,
-   );
- }
- return LiveServerContent(
-   modelTurn: modelTurn,
-   turnComplete: turnComplete,
-   interrupted: interrupted,
-   inputTranscription: inputTranscription,
-   outputTranscription: outputTranscription,
- );
+ Transcription? parseTranscription(String key) {
+   if (serverContentJson.containsKey(key)) {
+     final transcriptionJson =
+         serverContentJson[key] as Map<String, dynamic>;
+     return Transcription(
+       text: transcriptionJson['text'] as String?,
+       finished: transcriptionJson['finished'] as bool?,
+     );
+   }
+   return null;
+ }
+ return LiveServerContent(
+   modelTurn: modelTurn,
+   turnComplete: turnComplete,
+   interrupted: serverContentJson['interrupted'] as bool?,
+   inputTranscription: parseTranscription('inputTranscription'),
+   outputTranscription: parseTranscription('outputTranscription'),
+ );

Comment on lines +109 to +116
if (_liveGenerationConfig != null &&
_liveGenerationConfig.inputAudioTranscription != null)
'input_audio_transcription':
_liveGenerationConfig.inputAudioTranscription!.toJson(),
if (_liveGenerationConfig != null &&
_liveGenerationConfig.outputAudioTranscription != null)
'output_audio_transcription':
_liveGenerationConfig.outputAudioTranscription!.toJson(),


low

The check for _liveGenerationConfig != null is repeated. You can make this code cleaner and more readable by using a local variable and a collection-if with a spread operator.

        final liveConfig = _liveGenerationConfig;
        if (liveConfig != null) ...{
          if (liveConfig.inputAudioTranscription != null)
            'input_audio_transcription':
                liveConfig.inputAudioTranscription!.toJson(),
          if (liveConfig.outputAudioTranscription != null)
            'output_audio_transcription':
                liveConfig.outputAudioTranscription!.toJson(),
        },

